Kai Xiong

DeepTool: Scaling Interleaved Deliberation in Tool-Integrated Reasoning via Process-Supervised Reinforcement Learning

May 28, 2026, via arXiv

X-Imitator: Spatial-Aware Imitation Learning via Bidirectional Action-Pose Interaction

May 12, 2026, via arXiv

NEX: Neuron Explore-Exploit Scoring for Label-Free Chain-of-Thought Selection and Model Ranking

Feb 05, 2026, via arXiv

MAESTRO: Meta-learning Adaptive Estimation of Scalarization Trade-offs for Reward Optimization

Jan 12, 2026, via arXiv

ARM: Role-Conditioned Neuron Transplantation for Training-Free Generalist LLM Agent Merging

Jan 12, 2026, via arXiv

Consolidation or Adaptation? PRISM: Disentangling SFT and RL Data via Gradient Concentration

Jan 12, 2026, via arXiv

Do LLMs Signal When They're Right? Evidence from Neuron Agreement

Oct 30, 2025, via arXiv

PuzzleClone: An SMT-Powered Framework for Synthesizing Verifiable Data

Aug 21, 2025, via arXiv

Com$^2$: A Causal-Guided Benchmark for Exploring Complex Commonsense Reasoning in Large Language Models

Jun 08, 2025, via arXiv

Self-Route: Automatic Mode Switching via Capability Estimation for Efficient Reasoning

May 27, 2025, via arXiv